e^v/(1+e^v) + |S| wq e^v/(1+e^v) = 0. Solving the equation, we have
Note that the value of R̂ can be computed in constant time if the values of Wp and Wn are given. We stress that this result holds for any loss function ℓ satisfying ℓ(v,y) > ℓ(y,y) ≥ 0 for v ≠ y. We performed additional experiments to empirically investigate how the uPU and nnPU risk estimators differ with regard to overfitting. In Table 11 we report the training risks (measured as PU risk, since the data is PU) and testing risks (measured as PN risk, since the data is PN) using the zero-one loss ℓ0/1(v,y) = (1 − sign(vy))/2 on a number of datasets. From the results we can see that the training risk is significantly smaller than the test risk in the uPU setting as compared to the nnPU setting, confirming that uPU suffers more from overfitting than nnPU. Table 11: Training and testing risk of PU ET. Figure 4 shows that the normalized risk reduction importance makes many more pixels more important.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- (2 more...)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Law (0.93)
- Information Technology (0.93)
- (2 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Data Science (0.67)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Liaoning Province > Shenyang (0.04)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Materials > Chemicals > Industrial Gases > Liquified Gas (1.00)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (1.00)
- Energy > Oil & Gas > Midstream (1.00)
AIRwaves at CheckThat! 2025: Retrieving Scientific Sources for Implicit Claims on Social Media with Dual Encoders and Neural Re-Ranking
Ashbaugh, Cem, Baumgärtner, Leon, Gress, Tim, Sidorov, Nikita, Werner, Daniel
Linking implicit scientific claims made on social media to their original publications is crucial for evidence-based fact-checking and scholarly discourse, yet it is hindered by lexical sparsity, very short queries, and domain-specific language. Team AIRwaves ranked second in Subtask 4b of the CLEF-2025 CheckThat! Lab with an evidence-retrieval approach that markedly outperforms the competition baseline. The optimized sparse-retrieval baseline (BM25) achieves MRR@5 = 0.5025 on the gold-label blind test set. To surpass this baseline, a two-stage retrieval pipeline is introduced: (i) a first stage that uses a dual encoder based on E5-large, fine-tuned with in-batch and mined hard negatives and enhanced through chunked tokenization and rich document metadata; and (ii) a neural re-ranking stage using a SciBERT cross-encoder. Replacing purely lexical matching with neural representations lifts performance to MRR@5 = 0.6174, and the complete pipeline improves it further to MRR@5 = 0.6828. The findings demonstrate that coupling dense retrieval with neural re-rankers delivers a powerful and efficient solution for tweet-to-study matching and provides a practical blueprint for future evidence-retrieval pipelines.
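For readers unfamiliar with the reported metric, the following is a minimal sketch of MRR@5, assuming one gold document per query, as the tweet-to-study setting suggests. The function name `mrr_at_k` and the toy identifiers are hypothetical and are not taken from the team's code.

```python
from typing import Hashable, Sequence


def mrr_at_k(rankings: Sequence[Sequence[Hashable]],
             gold: Sequence[Hashable],
             k: int = 5) -> float:
    """Mean reciprocal rank, counting only hits within the top-k results."""
    total = 0.0
    for ranking, gold_id in zip(rankings, gold):
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if doc_id == gold_id:
                total += 1.0 / rank  # reciprocal rank of the gold document
                break                # a miss within the top k contributes 0
    return total / len(gold)


# Toy example (hypothetical IDs): hits at rank 1, rank 2, and outside the top 5.
rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r", "s", "t", "u"]]
gold = ["a", "y", "u"]
print(mrr_at_k(rankings, gold))  # (1 + 1/2 + 0) / 3 = 0.5
```

Because only the top five positions count, a re-ranker that moves the gold document from rank 6 to rank 2 raises that query's contribution from 0 to 0.5.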
- Europe > Spain > Galicia > Madrid (0.04)
- Asia > China > Hong Kong (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.64)
- Health & Medicine (0.68)
Decorrelated feature importance from local sample weighting
Fröhlich, Benedikt, Durst, Alison, Behr, Merle
Feature importance (FI) statistics provide a prominent and valuable source of insight into the decision process of machine learning (ML) models, but their effectiveness has well-known limitations when the features in the training data are correlated. In this case, the FI often tends to be distributed among all features that are correlated with the response-generating signal features. Even worse, if multiple signal features are strongly correlated with a noise feature while being only modestly correlated with one another, the noise feature can receive a distinctly larger FI score than any signal feature. Here we propose local sample weighting (losaw), which can be flexibly integrated into many ML algorithms to improve FI scores in the presence of feature correlation in the training data. Our approach is motivated by inverse probability weighting in causal inference and, locally within the ML model, uses a sample weighting scheme to decorrelate a target feature from the remaining features. This reduces model bias locally, whenever the effect of a potential signal feature is evaluated and compared to others. Moreover, losaw comes with a natural tuning parameter, the minimum effective sample size of the weighted population, which corresponds to an interpretation-prediction tradeoff, analogous to the bias-variance tradeoff of classical ML tuning parameters. We demonstrate how losaw can be integrated within decision-tree-based ML methods and within mini-batch training of neural networks. We investigate losaw for random forests and convolutional neural networks in a simulation study on settings showing diverse correlation patterns. We found that losaw improves FI consistently. Moreover, it often improves prediction accuracy on out-of-distribution data, while maintaining a similar accuracy on in-distribution test data.
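The tuning parameter can be made concrete with a small sketch. The abstract does not say how the effective sample size is computed, so the code assumes Kish's common definition, (Σw)² / Σw². The function name and the example weights are hypothetical.

```python
from typing import Sequence


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish's effective sample size (sum w)^2 / sum w^2 (an assumed definition)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_sq


# Uniform weights keep every sample; concentrated weights shrink the effective size.
print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))  # 4.0
print(effective_sample_size([1.0, 0.0, 0.0, 0.0]))  # 1.0
```

Under this reading, a higher minimum effective sample size limits how uneven the weights may become. That would presumably favor prediction stability over decorrelation, in line with the interpretation-prediction tradeoff the abstract describes.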
- Europe > Germany > Bavaria > Regensburg (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)